@naromero77amd

These are backports based on these four upstream PRs:

pytorch#163908 (persistent reduction autotune)
pytorch#161280 (reduction)
pytorch#162053 (foreach)
pytorch#163197 (pointwise)

Also included are some additional customer-specific configs which were not upstreamed.

Similar to this backport PR for ROCm fork release/2.9:
#2723

jataylo and others added 19 commits November 14, 2025 22:52
(cherry picked from commit 5d4455f)
(cherry picked from commit d3d77f5)
(cherry picked from commit 2fc7525)
(cherry picked from commit 528cf02)
(cherry picked from commit d5c71f0)
(cherry picked from commit 11e1dfc)
(cherry picked from commit 262a33e)
(cherry picked from commit 0cf1c89)
(cherry picked from commit 9f19754)
(cherry picked from commit dee2fdf)
Removed the (erroneous?) check that disables autotuning for pointwise
kernels.

(cherry picked from commit e3b8e25)
(cherry picked from commit 10af207)
(cherry picked from commit b9e0182)
Added two grid configs for the 2D pointwise kernel cases in the WRT5
workload.
Confirmed that they were picked up when using max autotune.

(cherry picked from commit f1eac49)
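
For readers unfamiliar with why an added config has to be "picked up", here is a toy sketch of the selection step: every candidate is timed and the fastest one wins, so a new config only matters if it beats the existing ones. This is not Inductor's implementation. `Candidate`, `pick_fastest`, `toy_benchmark`, and all block sizes are hypothetical, and the cost model is made up in place of a real GPU timing run.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Candidate:
    """Hypothetical stand-in for one 2D pointwise launch config."""
    xblock: int
    yblock: int
    num_warps: int


def pick_fastest(candidates: Iterable[Candidate],
                 benchmark: Callable[[Candidate], float]) -> Candidate:
    """Return the candidate with the lowest measured time."""
    return min(candidates, key=benchmark)


def toy_benchmark(cfg: Candidate) -> float:
    """Made-up cost model (assumption): favors tiles of about 4096 elements."""
    return abs(cfg.xblock * cfg.yblock - 4096) + cfg.num_warps


# Illustrative values only; the real configs are not listed in this PR text.
defaults = [Candidate(32, 32, 4), Candidate(16, 16, 2)]
extra = [Candidate(64, 64, 8), Candidate(128, 16, 4)]

best = pick_fastest(defaults + extra, toy_benchmark)
print(best)
```

Under this made-up cost model one of the added candidates wins, which is the kind of outcome the commit message reports for the real configs.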
(cherry picked from commit 2e79001)
(cherry picked from commit 04aa3e4)
This config improves the performance of a 1D pointwise kernel by 20% as
measured on MI350.

(cherry picked from commit a7bac0a)
(cherry picked from commit 0bdb796)
(cherry picked from commit af5f678)
(cherry picked from commit 16e8266)
(cherry picked from commit 8bd33f9)
(cherry picked from commit dfc1579)
(cherry picked from commit 8f60456)
(cherry picked from commit 666e81b)
(cherry picked from commit f6aaaf8)
(cherry picked from commit f97c7a9)
(cherry picked from commit db49466)
(cherry picked from commit 6e9b4ee)
(cherry picked from commit c36d85f)
(cherry picked from commit 0c52d01)
(cherry picked from commit 83e453f)
(cherry picked from commit dd990a3)
(cherry picked from commit 0de435f)
(cherry picked from commit 9534cbd)
(cherry picked from commit 189481e)
(cherry picked from commit 7eeb1ba)
(cherry picked from commit eea659c)
Slightly reorganized how hard-coded autotuning configs are added.
Fixed WRT1 configs.
Added WRT2 and WRT3 configs.

(cherry picked from commit e3e9a17)
(cherry picked from commit 6534df0)

rocm-repo-management-api bot commented Nov 15, 2025

Jenkins build for 7850a9c97813ff2687769efd9a6c4ff5ff749187 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

rocm-repo-management-api bot commented Nov 15, 2025

Jenkins build for dbdb5542c2ae0f09415495c33bfd7d5d0f77bc53 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

naromero77amd changed the title from "[release/2.7][ROCm][inductor] Inductor heuristic upstream backports" to "[NO CP][release/2.7][ROCm][inductor] Inductor heuristic upstream backports" on Nov 20, 2025
Added a check that includes autotune configs for 2D POI only if their
size is big enough.

(cherry picked from commit a2b0fd7)
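
A minimal sketch of what such a size gate could look like, assuming "size" means the kernel's total element count. The actual criterion and threshold are not stated in this PR text, so `MIN_NUMEL_FOR_EXTRA`, `Candidate`, and `candidates_for_2d_pointwise` are hypothetical names with illustrative values.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Candidate:
    """Hypothetical stand-in for one 2D pointwise launch config."""
    xblock: int
    yblock: int


# Assumed threshold; the real value is not given in the PR text.
MIN_NUMEL_FOR_EXTRA = 1 << 20


def candidates_for_2d_pointwise(xnumel: int,
                                ynumel: int,
                                base: Sequence[Candidate],
                                extra: Sequence[Candidate],
                                min_numel: int = MIN_NUMEL_FOR_EXTRA) -> List[Candidate]:
    """Return base configs, plus the extra ones only for large enough kernels."""
    configs = list(base)
    if xnumel * ynumel >= min_numel:
        # Append extras while skipping any that are already present.
        configs += [c for c in extra if c not in configs]
    return configs


base = [Candidate(32, 32)]
extra = [Candidate(64, 64), Candidate(32, 32)]

small = candidates_for_2d_pointwise(128, 128, base, extra)
large = candidates_for_2d_pointwise(4096, 4096, base, extra)
```

Gating this way keeps small kernels from paying autotuning time for configs that are unlikely to help them.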

rocm-repo-management-api bot commented Nov 20, 2025

Jenkins build for d235a1504f6702249dd72deef1a8f68ce991320a commit finished as FAILURE
Links: Blue Ocean view / Build artifacts

rocm-repo-management-api bot commented Nov 20, 2025

Jenkins build for 627a5718c93f8c54fca6787f3167b2b454717226 commit finished as FAILURE
Links: Blue Ocean view / Build artifacts
